DETAILED ACTION

Response to Arguments 

1. Applicant's arguments filed 05/04/07 have been fully considered but they are not
persuasive.

2. Applicant argues that Thelen et al. do not teach selecting at least one signal to
be transmitted to a server, from the audio signal to be recognized and a signal
indicating calculated modeling parameters (Amendment, pages 8 - 10).

The examiner disagrees. Thelen et al. teach that the speech controller directs
part (or all) of the speech input signal to the server station if a performance indicator for
a recognition result of the speech recognizer in the local client station is below a
predetermined threshold. It may be preferred to also route earlier speech material to
the server station, allowing the server station to better synchronize with the speech signal
and optionally choose suitable recognition models, such as acoustic or acoustic
language models, based on an earlier part of the signal (col. 9, lines 19 - 33). Routing only
a part of the speech signal, or optionally routing acoustic models based on the earlier part
of the signal, to the server station implies selecting between two signals and transmitting
the selected one to a server, since the signal routed to the server is based on the
performance of the local recognizer.
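
For illustration only, the threshold-based routing described above can be sketched as
follows. This is a minimal sketch under stated assumptions, not code from Thelen et al.:
the names LocalResult, route_to_server, score, threshold and context are hypothetical,
and the performance indicator is assumed to be a single number where higher is better.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LocalResult:
    """Outcome of the local (client-side) recognizer. Hypothetical type."""
    text: str
    score: float  # assumed performance indicator; higher means more reliable


def route_to_server(frames: List[bytes], start: int, result: LocalResult,
                    threshold: float, context: int = 0) -> Optional[List[bytes]]:
    """Return the part of the speech input to send to the server.

    Returns None when the local result meets the threshold, so nothing
    needs to be routed. Otherwise returns the frames from `start` onward,
    optionally preceded by up to `context` earlier frames so the server
    can synchronize with the signal.
    """
    if result.score >= threshold:
        return None
    first = max(0, start - context)
    return frames[first:]
```

Setting context to a value greater than zero corresponds to also routing earlier speech
material; setting start to zero routes all of the input.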

3. Applicant argues that neither Thelen et al. nor Yang et al. teach that the control
means of the server directs the signal either to the input signal modeling parameter
calculation means (and then to the recognition means) or to the recognition means,
depending upon the nature of the signal received from the terminal (Amendment, pages
11 - 13).

The examiner disagrees. Yang et al. teach that the client side data will be
transmitted to the C-DSR server as a message packet, wherein the message packet
comprises configuration data and speech data. The configuration controller is used to
generate a recognition adjustment parameter according to the configuration data, and
the speech data is subsequently sent to the C-DSR engine to proceed with speech
recognition (paragraphs 26, 28 - 30). Using the configuration controller to adjust
recognition parameters according to the configuration data, and then sending the speech
data on for speech recognition, implies directing the signal either to the input
signal modeling parameter calculation means and then to the recognition means, or to
the recognition means, depending upon the nature of the signal received from the
terminal, since the message packet received contains two types of data, configuration
data and speech data.
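
For illustration only, the server-side control at issue (directing a received signal
either to parameter calculation and then to recognition, or directly to recognition,
depending on the nature of the signal) can be sketched as follows. This is a minimal
sketch under stated assumptions and is not taken from the application, Thelen et al., or
Yang et al.: the types AudioSignal and ModelingParameters and the function dispatch are
hypothetical, and the calculation and recognition steps are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union


@dataclass
class AudioSignal:
    """Raw audio received from the terminal. Hypothetical type."""
    samples: Sequence[float]


@dataclass
class ModelingParameters:
    """Parameters already calculated at the terminal. Hypothetical type."""
    values: Sequence[float]


def dispatch(received: Union[AudioSignal, ModelingParameters],
             compute_parameters: Callable[[AudioSignal], ModelingParameters],
             recognize: Callable[[ModelingParameters], str]) -> str:
    """Route the received signal according to its nature."""
    if isinstance(received, AudioSignal):
        # Audio: calculate modeling parameters on the server first.
        params = compute_parameters(received)
    elif isinstance(received, ModelingParameters):
        # Parameters: skip calculation and go straight to recognition.
        params = received
    else:
        raise TypeError("unsupported signal type")
    return recognize(params)
```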

Claim Rejections - 35 USC § 102 

4. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(a) the invention was known or used by others in this country, or patented or described in a printed 
publication in this or a foreign country, before the invention thereof by the applicant for a patent. 

5. Claims 9 and 11 are rejected under 35 U.S.C. 102(a) as being anticipated by
Thelen et al. (US Patent 6,487,534).

As per claim 9, Thelen et al. teach a user terminal in a distributed speech
recognition system comprising one server suitable for communication with said user
terminal, said user terminal comprising:

means for obtaining an audio signal to be recognized (fig. 7, elements 740 and
750; col. 1, line 7);

first audio signal modeling parameter calculation means ("characterized by an 
HMM, whose parameters are estimated"; col. 5, lines 23 - 25); 

first control means for selecting at least one signal (a part of the speech input) to 
be transmitted to the server, from the audio signal to be recognized and a signal 
indicating the calculated modeling parameters (fig. 3, element 335; col. 8, lines 6-10). 

As per claim 11, Thelen et al. further disclose recognition means
(recognition unit) to associate at least one stored form with the modeling parameters
(estimated parameters; col. 5, lines 23 - 25).



Claim Rejections - 35 USC § 103 

6. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains.
Patentability shall not be negatived by the manner in which the invention was made. 

7. Claims 1 - 8 and 12 - 16 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Thelen et al. (US Patent 6,487,534) in view of Yang et al. (US PAP 2004/0044522).

As per claim 1, Thelen et al. teach a distributed speech recognition system
comprising at least one user terminal and at least one server suitable for communication
with one another via a telecommunications network, wherein the user terminal
comprises:

means for obtaining an audio signal to be recognized (fig. 7, elements 740 and
750; col. 1, line 7);

first audio signal modeling parameter calculation means ("characterized by an
HMM, whose parameters are estimated"; col. 5, lines 23 - 25);

first control means for selecting at least one signal (a part of the speech input) to
be transmitted to the server, from the audio signal to be recognized and a signal
indicating the calculated modeling parameters (fig. 3, element 335; col. 8, lines 6 - 10);
and

wherein the server comprises:

means for receiving the selected signal ("speech equivalent signal is received in
the server station") originating from the user terminal (fig. 7, elements 770 and 780;
col. 9, lines 53 - 54);

recognition means (recognition unit) for associating at least one stored form with 
input parameters (estimated parameters; col. 5, lines 23 - 25). 

However, Thelen et al. do not specifically teach a second input signal modeling
parameter calculation means; second control means for controlling the second
calculation means and the speech recognition means in order, if the selected signal
received by the reception means is an audio signal, to activate the second parameter
calculation means by addressing the selected signal to them as an input signal, and to
address the parameters calculated by the second calculation means to the recognition
means as input parameters, and if the selected signal received by the reception means
indicates modeling parameters, to address said indicated parameters to the recognition
means as input parameters.

Yang et al. teach that the client side data will be transmitted to the C-DSR server
as a message packet, wherein the message packet comprises configuration data and
speech data. The configuration controller is used to generate a recognition adjustment
parameter according to the configuration data, and the speech data is subsequently
sent to the C-DSR engine to proceed with speech recognition (paragraphs 26, 28 - 30;
using the configuration controller to adjust recognition parameters according to the
configuration data, and then sending the speech data on for speech recognition,
implies directing the signal either to the input signal modeling parameter calculation
means and then to the recognition means, or to the recognition means, depending upon
the nature of the signal received from the terminal, since the message packet received
contains two types of data, configuration data and speech data).

Therefore it would have been obvious to one of ordinary skill in the art at the time
the invention was made to generate adjusted model parameters based on message
packets received from the client as taught by Yang et al., in Thelen et al., because doing
so would improve the speech recognition system by automatically classifying recognition
results and their associated configuration data (paragraph 15).

As per claim 2, Thelen et al., further disclose a voice activation means (spoken 
activation command) recognized in the form of extracts of an audio signal, outside 
speech segment of voice inactivity periods (col. 2, lines 15 - 20). 

As per claim 3, Thelen et al. further disclose that the first control means are
adapted to select the signal to be transmitted to the server ("selecting a part of the
speech input") from at least the original audio signal, the audio signal to be recognized in
the form of segments extracted by the voice activation means and the signal indicating
modeling parameters calculated by the first parameter calculation means ("estimated
parameters"; col. 5, lines 23 - 25; fig. 3, element 335; col. 8, lines 6 - 10; col. 1, lines 57 -
64).

As per claim 4, Thelen et al., further disclose a voice activation means (spoken 
activation command) recognized in the form of extracts of an audio signal, outside 
speech segment of voice inactivity periods (col. 2, lines 15 - 20). 

However, Yang et al. teach that "the C-DSR server comprises a second control
means (configuration controller) for controlling the second calculation means and the
speech recognition means (generating a recognition adjustment parameter). The
C-DSR server receives message packets from the client mobile device, and generating
adjusted speech recognition parameters according to the configuration data, and then
returns a result to the client mobile device after completing the recognition task"
(generating adjusted speech recognition parameters based on the message packet
received and returning a recognition result to the client mobile device suggests activating
the second parameter calculation means, considering the audio signal as the input signal
and the modeling parameters as input parameters, since the server returns a result to the
mobile client when the recognition task is completed, based on the input signal received
from the client; paragraph 18, lines 8 - 12; paragraph 19, lines 1 - 8).

Therefore it would have been obvious to one of ordinary skill in the art at the time
the invention was made to generate adjusted model parameters based on message
packets received from the client as taught by Yang et al., in Thelen et al., because doing
so would improve the speech recognition system by automatically classifying recognition
results and their associated configuration data (paragraph 15).

As per claim 5, Thelen et al., further disclose recognition means (recognition unit) 
for associating at least one stored form with the modeling parameters calculated by the 
first calculation means ("estimated parameters"; col. 5, lines 23 - 25). 

As per claim 6, Thelen et al. further disclose that the first control means is adapted to
select the signal to be transmitted to the server according to the result supplied by the
terminal recognition means ("selecting part of the speech signal via network to the
server station in dependence on the outcome of the recognition"; col. 1, lines 57 - 64).

As per claim 7, Thelen et al. further disclose storage means ("harddisk or ROM")
adapted to store the audio signal to be recognized (col. 9, lines 63 - 64; col. 10, lines 8 -
13).

As per claim 8, Thelen et al. further disclose that the control means is adapted to
select a signal to be transmitted to the server independently of the result supplied by the
recognition means of the terminal ("the signal need not be directed to the local
recognizer" implies transmitting to the server independently of the result supplied by the
recognition means of the terminal; col. 8, lines 24 and 25).

As per claims 10 and 12, Thelen et al. teach the system of claims 9 and 10.
However, Thelen et al. do not specifically teach that at least part of the parameter
calculation means is downloaded from the server.

Yang et al. teach "the C-DSR server receives message packets from the client
mobile device, and generating adjusted speech recognition parameters according to the
configuration data, and then returns a result to the client mobile device after completing
the recognition task" (generating adjusted speech recognition parameters based on the
message packet received suggests that the parameter calculation means is
downloaded from the server, since the adjusted speech parameters are generated on the
server side; paragraph 18, lines 8 - 12).

Therefore it would have been obvious to one of ordinary skill in the art at the time
the invention was made to generate adjusted model parameters based on message
packets received from the client as taught by Yang et al., in Thelen et al., because doing
so would improve the speech recognition system by automatically classifying recognition
results and their associated configuration data (paragraph 15).

As per claim 13, Thelen et al., teach a server in a distributed speech recognition 
system comprising one server suitable for communication with said user terminal, said 
user terminal comprising: 

means for receiving from a user terminal, a signal selected at said terminal (fig. 
7, elements 740, and 750; col.1, line 7); 

input signal modeling parameter calculation means ("characterized by an HMM,
whose parameters are estimated"; col. 5, lines 23 - 25);

recognition means (recognition unit) for associating at least one stored form with
input parameters (estimated parameters; col. 5, lines 23 - 25).

However, Thelen et al. do not specifically teach a second input signal modeling
parameter calculation means; second control means for controlling the second
calculation means and the speech recognition means in order, if the selected signal
received by the reception means is an audio signal, to activate the second parameter
calculation means by addressing the selected signal to them as an input signal, and to
address the parameters calculated by the second calculation means to the recognition
means as input parameters, and if the selected signal received by the reception means
indicates modeling parameters, to address said indicated parameters to the recognition
means as input parameters.

Yang et al. teach that the client side data will be transmitted to the C-DSR server
as a message packet, wherein the message packet comprises configuration data and
speech data. The configuration controller is used to generate a recognition adjustment
parameter according to the configuration data, and the speech data is subsequently
sent to the C-DSR engine to proceed with speech recognition (paragraphs 26, 28 - 30;
using the configuration controller to adjust recognition parameters according to the
configuration data, and then sending the speech data on for speech recognition,
implies directing the signal either to the input signal modeling parameter calculation
means and then to the recognition means, or to the recognition means, depending upon
the nature of the signal received from the terminal, since the message packet received
contains two types of data, configuration data and speech data).

Therefore it would have been obvious to one of ordinary skill in the art at the time
the invention was made to generate adjusted model parameters based on message
packets received from the client as taught by Yang et al., in Thelen et al., because doing
so would improve the speech recognition system by automatically classifying recognition
results and their associated configuration data (paragraph 15).

As per claims 14 and 15, Thelen et al. further disclose means for downloading
voice recognition software resources via the telecommunications network to a terminal,
the software resources including at least part of the recognition means of the terminal
(that the client station comprises communication means for communicating via the
internet, formed by a combination of hardware and software, implies means for
downloading voice recognition software resources via the telecommunications network
to a terminal; col. 7, lines 38 - 46).

As per claim 16, Thelen et al., further disclose recognition means (recognition 
unit) for associating at least one stored form with modeling parameters (estimated 
parameters; col. 5, lines 23 - 25). 

Conclusion 

8. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time
policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later
than SIX MONTHS from the mailing date of this final action. 

9. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Leonard Saint-Cyr whose telephone number is (571)
272-4247. The examiner can normally be reached on Monday - Friday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Richemond Dorvil, can be reached on (571) 272-7602. The fax phone
number for the organization where this application or proceeding is assigned is (571)
273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



LS 

07/10/07 




